Abstract

In this paper we test the random walk hypothesis on the high-frequency dataset of
the bid-ask Deutschemark/US dollar exchange rate quotes registered by the
inter-bank Reuters network over the period October 1, 1992 to September 30, 1993.

Then we propose a stochastic model for price variation which is able to describe
some important features of the exchange market behavior. Besides the usual
correlation analysis, we have verified the validity of this model by means of
other approaches inspired by information theory.

These techniques are not only severe tests of the approximation but also highlight
some aspects of the data series which have a clear financial relevance.

JEL classification: F31

I Introduction



Any financial theory begins by introducing a reasonable model for price variation
in terms of a suitable stochastic process. In this paper we want to select an
asset-pricing model able to describe some features that are interesting from a
financial point of view at the weak level, i.e. including only information arising
from the analysis of historical prices and not including other public (see for
example Cusatis et al. (1993), Asquith (1983), Ritter (1991), Loughran and Ritter
(1995)) or private (see Ito et al. (1998)) information.

To attain this aim we analyze the foreign exchange market, which presents
several advantages compared with other financial markets. First of all, it is a
very liquid market. This feature is important because every financial market
provides a single sample path, the set of the registered quotes of the asset. To
be sure that a statistical description makes sense, a reasonable requirement
is that this (unique) sequence of quotes is not dominated by single events or
by a single trader's operations. A natural candidate for such a description (if
one is possible) is then a liquid market involving several billions of dollars,
traded daily by thousands of actors. In addition, the foreign exchange market has
no business time limitations. Many market makers have branches worldwide, so
trading can occur almost continuously. In the analysis one can therefore avoid,
at least as a first approximation, the problems involved in the opening and
closure of a particular market. Finally, if one considers the currency exchange,
the returns

    r_t = \ln \frac{S_{t+1}}{S_t}    (1)

are almost symmetrically distributed, where S_t is the price at time t, defined
as the average between bid and ask prices.
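
As a minimal sketch of the definition above, the following Python fragment builds the mid-price from bid and ask quotes and then the logarithmic returns between consecutive quotes. The function names and the use of plain lists are assumptions made for illustration; they are not part of the original analysis.

```python
import math
from typing import List, Sequence


def mid_price(bid: float, ask: float) -> float:
    """Price S_t taken as the average between bid and ask."""
    return 0.5 * (bid + ask)


def log_returns(bids: Sequence[float], asks: Sequence[float]) -> List[float]:
    """Logarithmic returns ln(S_{t+1} / S_t) between consecutive quotes."""
    prices = [mid_price(b, a) for b, a in zip(bids, asks)]
    return [math.log(prices[t + 1] / prices[t]) for t in range(len(prices) - 1)]
```

With this convention a sequence of L quotes yields L - 1 returns, indexed by the position of the quote in the sequence.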

In this paper we investigate the possibility of describing the Deutschemark/US
dollar exchange (the most liquid market) in terms of a Markov process. We
consider a high-frequency dataset in order to obtain statistically relevant
results. Our data, made available by Olsen and Associates, contain all worldwide
1,472,241 bid-ask Deutschemark/US dollar exchange rate quotes registered
by the inter-bank Reuters network over the period October 1, 1992 to
September 30, 1993.

One of the main problems when analyzing financial series is that the quotes
are irregularly spaced. In section 2 we briefly describe some different ways to
introduce time in finance, and we discuss why we chose the business time,
i.e. the time of a transaction is its position in the sequence of the registered
quotes.

The history of the efforts to propose proper stochastic processes for
price variations is very long. An efficient foreign currency market, i.e. one where
prices reflect the whole information, suggests that returns are independently
distributed. Following Fama (1970) we shall hereafter call "random walk" a
financial model where returns are independent variables. Without entering into
a detailed review, we recall the seminal work of Bachelier (1900), who assumed
(and tested) that price variations follow an independent Gaussian process. It is
now commonly believed that returns do not behave according to a Gaussian.
Mandelbrot (1963) has proposed that returns are Lévy-stable distributed, still
remaining independent random variables. A recent proposal is the truncated
Lévy distribution model introduced by Mantegna and Stanley (1995), which
fits financial data well, even when they are considered at different time lags.

Because of the financial importance of correlations for arbitrage opportunities
(see Pagan (1996) for a review and Baviera et al. (1999) for more recent
results) it is essential to introduce an unquestionable technique able to answer
the question: can "random walk" models correctly describe return variations?
In section 2 we show that the "random walk" is inadequate to describe even
qualitatively some important features of price behavior. We measure the
probability distribution of the exit time, i.e. the lag needed to reach a given
return amplitude Δ. This analysis not only shows the presence of strong
correlations but also leads to a natural measure of time which is intrinsic to the
market evolution, i.e. the time in which the market has such a fluctuation.
Following Baviera et al. (1999) we call this time Δ-trading time.

In section 3, measuring time in Δ-trading time, we discuss the validity of
a Markovian approximation of the market behavior. First, using the quotes,
we build a Markov model of order m; then, starting from the usual correlation
function approach, we consider several techniques to verify the quality of this
description.

We also perform an entropic analysis, inspired by the Kolmogorov (1956)
ε-entropy. This kind of analysis is equivalent to considering a speculator who
cares only about market fluctuations of a given size Δ (see Baviera et al. (1999)).
Finally we use other statistical tools to test the validity of such an
approximation, namely the mutual information introduced by Shannon (1948) and
the Kullback and Leibler (1956) entropy, which measures the "discrepancy"
between the Markov approximation and the "true" return process.
In section 4 we summarize and discuss the results obtained.



II Exit times 

One of the main (and unsolved) problems in tick data analysis concerns the
irregular spacing of quotes. There are several candidates to measure the time
of each transaction.

The first one is obviously the calendar time, i.e. the Greenwich Mean Time
at which the transaction occurs. The trouble with such a choice is rather
evident: there are periods with no transactions. The simplest way (see, e.g.,
Mantegna and Stanley (1995)) to overcome this difficulty is to cut "nights" and
"weekends" from the signal, i.e. to assume a zero time lag between the closure
and reopening of the market. Of course, in a worldwide series it is less evident
what "night" or "weekend" means, but it is easy to observe that during a
day or a week there are lags when no transaction is present. An improvement
of the above procedure is to rescale the calendar time with a measure of market
activity, i.e. to create a new time scale under which the same market activity
occurs in all lags. We do not enter here into the literature on this subject;
we briefly mention the procedure in which one measures the market activity
with the average absolute number of quotes per lag (e.g. of 15 minutes) or
with the average absolute price change over each lag (see Dacorogna et al.
(1993), Fischer et al. (1997)). Let us note that the weight does not change the
time too much (roughly it is between 0.5 and 2.5), therefore there is not a big
difference from the naive procedure. A similar approach has been introduced
by Galluccio et al. (1997), where the time lags are rescaled with a properly
defined instantaneous volatility.

A slightly different approach is the business time. One considers all
transactions equivalent, and the time of a transaction is simply given by its
position in the sequence of quotes. In this section we shall adopt the business
time; it looks a reasonable choice when facing a worldwide sequence where lags
with no transactions are often a consequence of the geographical position of the
most important markets on the earth's surface. However, we have to stress that,
at least for some statistical features, there are no qualitative differences
between using the business time and the calendar time.

In this section we study the distribution of the exit times at a given resolution
Δ. This analysis will allow us to show that "random walk" models cannot
be a reasonable description. This technique will help us to understand how to
analyze financial time series.
Introducing

    r_{t,t_0} = \ln \frac{S_t}{S_{t_0}} ,    (2)

where t_0 is the initial business time and t > t_0, we wait until the time t_1
such that:

    |r_{t_1,t_0}| > \Delta .    (3)

Now, starting from t_1 with the above procedure we obtain t_2, and so on. In
this way we construct the sequences of returns and exit times at given Δ:

    \{\rho_1, \rho_2, \ldots, \rho_k, \ldots\}  where  \rho_k = \ln \frac{S_{t_k}}{S_{t_{k-1}}}    (4)

and

    \{\tau_1, \tau_2, \ldots, \tau_k, \ldots\}  where  \tau_k = t_k - t_{k-1} ,    (5)

where |ρ_k| > Δ by definition (see eq. (3)), and τ_k is the time after which we
have the k-th fluctuation of order Δ in the price.
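
The construction of eqs. (2)-(5) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the list-based input and the use of the list index as business time are assumptions, and the strict inequality follows eq. (3) as written above.

```python
import math
from typing import List, Sequence, Tuple


def exit_sequences(prices: Sequence[float], delta: float) -> Tuple[List[float], List[int]]:
    """Return (rho, tau): returns and exit times at resolution delta.

    prices[t] is the mid-quote at business time t (its position in the sequence).
    """
    rho: List[float] = []
    tau: List[int] = []
    t_ref = 0  # business time of the last exit (t_0 at the start)
    for t in range(1, len(prices)):
        r = math.log(prices[t] / prices[t_ref])  # r_{t, t_ref}, eq. (2)
        if abs(r) > delta:                       # barrier crossed, eq. (3)
            rho.append(r)                        # rho_k, eq. (4)
            tau.append(t - t_ref)                # tau_k, eq. (5)
            t_ref = t                            # restart from the exit point
    return rho, tau
```

The index k of the returned lists plays the role of the Δ-trading time: only the quotes at which a fluctuation of size Δ is completed are enumerated.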

In the following we call k the Δ-trading time, i.e. we enumerate only the
transactions at which a fluctuation Δ is reached. Since the distribution of the
returns is almost symmetric, the threshold Δ has been chosen equal for both
positive and negative values.

Fig. 1. Evolution of r_{t,t_k} with Δ = 0.01. The Δ-trading time is zero (k = 0) at
the first transaction, corresponding to 00:00:14 of October 1, 1992 in calendar
time, and the Δ-trading time is 4 (k = 4) at 11:59:28 of October 2, 1992
(business time 9939).

In figure 1 we show the evolution of the returns r_{t,t_k}.
Let us now study P_Δ(τ), the probability distribution function (PDF) of the exit
times (5) for a given size Δ. The shape of this PDF indicates whether a
stochastic process can be considered a good candidate to model price
variation. In figure 2 we show that P_Δ(τ) has a different shape for Δ smaller
or larger than the typical transaction cost γ_T, i.e. the typical value of
the transaction cost at time t, whose distribution has a narrow peak around
γ_T = 2.4·10^-4. Note that P_Δ(τ) is roughly exponential at
small Δ, while for Δ larger than γ_T, P_Δ(τ) clearly shows a non-exponential
shape.

In the inset we present ⟨τ⟩ vs. Δ. We denote with ⟨·⟩ the average of a sequence
{A_l}:

    \langle A \rangle = \frac{1}{L} \sum_{l=1}^{L} A_l    (6)

where L is the size of the sequence. One has a fairly clear scaling law of the
average exit time as a function of Δ for more than three decades in ⟨τ⟩:

    \langle \tau \rangle_\Delta \sim \Delta^{\alpha}  with  \alpha \simeq 2.2    (7)

for all Δ greater than the typical transaction cost γ_T.
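
A scaling exponent such as the one in eq. (7) is commonly estimated as the slope of a least-squares line in log-log coordinates. The sketch below shows that generic procedure; it is an assumption about how such a fit could be done, not a description of the fit actually used for the inset of figure 2, and the function name is hypothetical.

```python
import math
from typing import Sequence


def scaling_exponent(deltas: Sequence[float], mean_taus: Sequence[float]) -> float:
    """Least-squares slope of ln<tau> against ln(Delta)."""
    xs = [math.log(d) for d in deltas]
    ys = [math.log(m) for m in mean_taus]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xy / s_xx
```

In practice the fit would be restricted to thresholds above the typical transaction cost, where the scaling law is reported to hold.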

Fig. 2. Probability distribution function of the exit times τ for different values
of Δ: Δ = 5·10^-5 (full line), Δ = 2·10^-4 (dashed line) and Δ = 6·10^-4 (dotted
line). Those values are respectively smaller than, similar to and larger than the
typical transaction cost. With the dashed-dotted line we also show the PDF of the
exit times for a Wiener process (see eq. (8)) with ⟨τ⟩ = 16.85, equal to the value
obtained with Δ = 6·10^-4. In the inset we show ⟨τ⟩ versus Δ. The line shows the
asymptotic behavior (for Δ > γ_T) ⟨τ⟩ ∝ Δ^α with α = 2.2.

Let us now compare the PDF from the data analysis with the result one
obtains in "random walk" models. In the case of a Wiener process in the absence
of drift, following Feller (1971) one can find the exact solution for the PDF:

    P_\Delta^{rw}(\tau) = \frac{4}{\pi \bar{\tau}} \sum_{n=0}^{\infty} (-1)^n (2n+1) \, e^{-(2n+1)^2 \tau / \bar{\tau}}    (8)

where \bar{\tau} = (8 \Delta^2)/(\pi^2 \sigma^2) and σ^2 is the variance of the
Wiener process. The main characteristics of this PDF are:

• It is peaked around ⟨τ⟩. The similarity between the average and the typical
  value of this distribution allows us to interpret the average exit time as the
  one we expect to observe in the future with highest probability. This is no
  longer true in the case of a PDF with a power-law tail, and so it is more
  difficult to give a direct interpretation of the ⟨τ⟩ shown in the inset.

• It decreases exponentially at large τ. It is simple to understand that the
  exponential decay holds for any "random walk" model. In fact, in the absence
  of correlations, the probability of not exiting within a finite time τ̄ is
  bounded by some q < 1. Therefore the probability to exit after a time τ is less
  than q^(τ/τ̄), i.e. it decays exponentially. In figure 2 we also compare the
  normalized PDF computed from financial data with the P_Δ(τ) of a "random walk"
  with ⟨τ⟩ equal to the one obtained from the signal with Δ = 6·10^-4. We recall
  that γ_T ≃ 2.4·10^-4.

A model built with independent variables is unable to reproduce P_Δ(τ), and it
is simple to understand that processes with short memory, e.g. Markov
processes, also give for the exit times a PDF qualitatively similar (i.e.
decaying exponentially at large τ) to that of a Wiener process.
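
The exit-time density of a driftless Wiener process between two symmetric barriers at ±Δ can be evaluated numerically from its series representation. The sketch below assumes the standard form of that series, with time scale 8Δ²/(π²σ²) and σ² the variance per unit time; the function names and the truncation at 200 terms are choices made for illustration. Integrating the density term by term gives a mean exit time Δ²/σ², i.e. the quadratic scaling in Δ expected for a "random walk".

```python
import math


def wiener_exit_pdf(tau: float, delta: float, sigma2: float, terms: int = 200) -> float:
    """Series for the exit-time density of a driftless Wiener process at +/- delta."""
    tau_bar = 8.0 * delta ** 2 / (math.pi ** 2 * sigma2)
    total = 0.0
    for n in range(terms):
        k = 2 * n + 1
        total += (-1) ** n * k * math.exp(-k * k * tau / tau_bar)
    return 4.0 / (math.pi * tau_bar) * total


def wiener_mean_exit_time(delta: float, sigma2: float) -> float:
    """Mean of the density above, obtained by integrating the series term by term."""
    return delta ** 2 / sigma2
```

The series converges slowly as τ approaches zero, so the truncation should be increased if very small exit times (relative to the time scale) are needed.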
Fig. 3. Data collapse of P_Δ(τ/⟨τ⟩) vs τ/⟨τ⟩. We have plotted four rescaled PDFs:
Δ = 4·10^-4, Δ = 6·10^-4, Δ = 8·10^-4, and Δ = 1·10^-3. The full curve indicates
the Cauchy distribution.

Figure 3 shows the PDF data collapse, i.e. P_Δ(τ/⟨τ⟩) vs τ/⟨τ⟩ for different Δ,
all greater than γ_T.

One observes that the rescaled PDFs behave according to a single density
function, which is well approximated, apart from large τ/⟨τ⟩, by the Cauchy
distribution

(9)



The distribution P_Δ(τ) shows a power-law tail up to an exponential cut-off
(perhaps due to the finite size of the sample analyzed). We note that if Δ is
larger than the typical transaction cost, the probability distribution of τ has a
very long tail and so ⟨τ⟩_Δ is not the typical value of the sequence (5).
The scaling behavior of ⟨τ⟩ and the form of the distribution P_Δ(τ) indicate
the presence of strong correlations in the financial signal for Δ > γ_T.
Furthermore, the above analysis indicates the Δ-trading time as a natural
candidate to measure time. In the financial context this measure has the great
advantage of being intrinsic to the market evolution, i.e. it enumerates the times
at which the market has such a fluctuation. In the next section we shall show
that, measuring time in such a way, a simple Markov process is a valid
description of the market behavior.

III The Markovian approximation



In this section we construct a Markovian process of order m using a symbolic
sequence obtained from the returns {ρ_k} at fixed Δ. We define the Markovian
process by building the transition matrix from this symbolic sequence.
Symbolic dynamics is a rather powerful tool to catch the main statistical
features of time series. In order to construct a symbolic sequence from the
sequence {ρ_k} we need a coarse-graining procedure to partition the range of the
data, and then we assign a conventional symbol to each element of the partition.
We perform the following transformation:

    z_k = \begin{cases} -1 & \text{if } \rho_k < 0 \\ +1 & \text{if } \rho_k > 0 \end{cases}    (10)

In such a way we may study a discrete stochastic process which reproduces the
features of the original process we want to analyze. The financial meaning of this
codification is rather evident: the symbol -1 occurs if the price decreases
by a percentage Δ, while if the price increases by the same percentage the
symbol is +1. In the following we indicate with z^(i) the two possible values of
the symbolic sequence.

Let us briefly explain the financial meaning of the return sequence {ρ_k} at
fixed Δ. A speculator who modifies his portfolio only when a fluctuation of
size Δ appears in the price sequence cares only about these returns. Following
Baviera et al. (1999) we call such a speculator a patient investor. He
automatically performs a filtering procedure: he rejects all the quotes which do
not change the price by at least a percentage Δ.

Starting from this sequence we create a Markov process approximating the
symbolic sequence. In a Markov process of order m the probability of having
the symbol z_n at step n depends only on the state of the process at the
previous m steps n - 1, n - 2, ..., n - m.

Given a sequence of m symbols Z_m = {z^(i_1), z^(i_2), ..., z^(i_m)}, we define
N(Z_m) as the number of occurrences of the sequence Z_m and N(Z_m, j) as the
number of times the symbol z^(j) comes after the sequence Z_m. The transition
matrix of the Markov process of order m is:

    W_{Z_m, j} = \frac{N(Z_m, j)}{N(Z_m)}    (11)

and the probability of the sequence Z_m is:

    P(Z_m) = \frac{N(Z_m)}{M_m}    (12)

where M_m = L - m is the total number of possible sequences of length m,
including overlaps (L is the size of the sequence). It can be shown (see e.g.
Feller (1971)) that the definitions (11) and (12) are coherent in the framework
of ergodic Markov processes.
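
The counting procedure behind the transition matrix and the block probabilities of an order-m approximation can be sketched as follows. The function name and the dictionary-based representation are assumptions for illustration; contexts are counted over the L - m overlapping windows that have a successor, so that each row of the estimated transition matrix sums to one.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple

Context = Tuple[Hashable, ...]


def estimate_markov(z: Sequence[Hashable], m: int) -> Tuple[Dict[Tuple[Context, Hashable], float], Dict[Context, float]]:
    """Estimate W[(Z_m, j)] = N(Z_m, j) / N(Z_m) and P[Z_m] = N(Z_m) / (L - m)."""
    n_context: Counter = Counter()
    n_follow: Counter = Counter()
    for t in range(len(z) - m):
        context = tuple(z[t:t + m])
        n_context[context] += 1
        n_follow[(context, z[t + m])] += 1
    windows = len(z) - m
    w = {key: count / n_context[key[0]] for key, count in n_follow.items()}
    p = {context: count / windows for context, count in n_context.items()}
    return w, p
```

For example, `estimate_markov(z, 1)` returns the one-step transition probabilities of a Markov chain together with the empirical probabilities of the two symbols.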

In the case of a Markov chain (i.e. a process of order 1) the transition matrix
W_ij, i.e. the probability of a transition in one step to the state z^(j) starting
from the state z^(i), contains all the relevant information for the process.
Naming N(i) the number of occurrences of the symbol z^(i) and N(i,j) the number
of times the symbol z^(j) comes after the symbol z^(i), the transition matrix is:

    W_{ij} = \frac{N(i,j)}{N(i)}

If the process is ergodic the probability P_i of the state z^(i) is given by

i=l 



where W^n is the n-th power of the matrix W.

In the following we check, using various statistical approaches, whether the
Markovian approximation of order m properly mimics the price behaviour. First of
all we perform an autocorrelation analysis, which is the most common test in
financial econometrics. Then we show that the model properly reproduces the
Shannon entropy. This quantity, as discovered by Kelly (1956) and shown in
Baviera et al. (1999), has a clear financial meaning and plays a central role
in this field.

Finally we test the validity of our approximation using more sophisticated
statistical tools of information theory. We compare the mutual information
of the signal to that of the approximation and we measure a "distance"
between the symbolic dynamics process and the Markov approximation, with
a technique based on the Kullback entropy.

The results we show in the following analysis have been obtained choosing
Δ = 4·10^-4, but they do not depend on this choice for Δ > γ_T.



A Autocorrelation function 



A standard approach to verify the temporal independence of processes is the
measurement of the autocorrelation function. A first simple similarity test can
be based on it: if two processes have the same autocorrelation function they lose
memory of their past in a similar way.

Typically one says that a series has short memory if its autocorrelation
function decays exponentially, as in Brockwell and Davis (1991).
The autocorrelation function of a random process x_t is

    C(n) = \langle x_t x_{t+n} \rangle - \langle x_t \rangle \langle x_{t+n} \rangle    (14)

where ⟨·⟩ indicates the sequence average introduced in equation (6).
The autocorrelation function can be easily computed for a Markov chain
described by the transition matrix W_ij and the probability P_i of the state z^(i):

    C^{(M)}(n) = \sum_{i,j} P_i \, z^{(i)} (W^n)_{ij} \, z^{(j)} - \Big( \sum_i P_i \, z^{(i)} \Big)^2    (15)

If the Markov chain is ergodic one has

    C^{(M)}(n) \sim e^{(\ln|\lambda_2|) n}  for large n ,    (16)

where λ_2 is the second eigenvalue of the transition matrix (see Feller (1971)).
In figure 4 we compare the autocorrelation function of the returns {ρ_k} with
those of sequences of the same length generated by Markov processes of various
orders. We also show the theoretical value for a Markov chain (see
equation (16)). We observe a very good agreement between the return
sequence and the Markov process within the statistical error. This error, for
the autocorrelation function measured from a finite dataset, is of order
O((L/τ_c)^(-1/2)), where τ_c is the correlation time and L is the length of the
sequence (L ≈ 160000 for Δ = 4·10^-4). This corresponds to the value of the
"plateau" in figure 4.

Fig. 4. Autocorrelation function of the return sequence {ρ_k} (+), Markov sequence
of order 1 (x) and Markov sequence of order 3 (*). The line is the asymptotic
analytical result for a Markov process of order 1 (see eq. (16)).

The same results are obtained if one computes autocorrelations of the symbolic
return sequence {z_k}. This is a consequence of the fact that the probability
density function of ρ_k is peaked around the values ±Δ.
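
The comparison between a measured autocorrelation and the decay rate of a Markov chain can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the estimator uses the covariance convention (average of the product minus the product of the averages), the chain has two states with symbols -1 and +1, and the transition probabilities in the usage note are made-up numbers, not values estimated from the exchange rate data. For a two-state chain the eigenvalues of the transition matrix are 1 and trace - 1, so the second eigenvalue is available in closed form.

```python
import random
from typing import List, Sequence


def autocorrelation(x: Sequence[float], n: int) -> float:
    """Sample estimate of <x_t x_{t+n}> - <x_t><x_{t+n}> over the overlapping part."""
    length = len(x) - n
    head = x[:length]
    tail = x[n:n + length]
    mean_head = sum(head) / length
    mean_tail = sum(tail) / length
    mean_product = sum(u * v for u, v in zip(head, tail)) / length
    return mean_product - mean_head * mean_tail


def second_eigenvalue(w: Sequence[Sequence[float]]) -> float:
    """Second eigenvalue of a 2x2 stochastic matrix (the first one is 1)."""
    return w[0][0] + w[1][1] - 1.0


def simulate_chain(w: Sequence[Sequence[float]], steps: int, seed: int = 0) -> List[int]:
    """Generate a symbolic sequence of -1/+1 from a two-state Markov chain."""
    rng = random.Random(seed)
    symbols = (-1, 1)
    state = 0
    sequence: List[int] = []
    for _ in range(steps):
        sequence.append(symbols[state])
        state = 0 if rng.random() < w[state][0] else 1
    return sequence
```

With the hypothetical matrix `w = [[0.4, 0.6], [0.6, 0.4]]` the second eigenvalue is negative, so the normalized autocorrelation `autocorrelation(z, n) / autocorrelation(z, 0)` of a simulated sequence alternates in sign while its magnitude shrinks geometrically with n, until it reaches the noise floor set by the finite length of the sequence.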

B Entropic analysis 

Let us briefly recall some basic concepts of information theory and discuss the
meaning of entropy in financial data analysis.

Given a symbolic sequence Z_n = {z^(i_1), z^(i_2), ..., z^(i_n)} of length n with
probability p(Z_n), we define the block entropy H_n as

    H_n = -\sum_{Z_n} p(Z_n) \ln p(Z_n) .    (17)

The difference

    h_n = H_{n+1} - H_n    (18)

is the average information needed to specify the symbol z_{n+1} given the previous
knowledge of the sequence {z_1, z_2, ..., z_n}. The series of h_n is monotonically
non-increasing and for an ergodic process one has

    h = \lim_{n \to \infty} h_n    (19)

where h is the Shannon (1948) entropy.

The maximum value of h is ln(2) (because we are considering sequences of only two
symbols). This value is reached if the process is totally uncorrelated
and the symbols have the same probability. Following Baviera et al. (1999), we
call available information the difference between the maximum
entropy and its real value:

    I = \ln(2) - h .

Khinchin (1957) shows that if the stochastic process {z_1, z_2, ...} is Markovian
of order m then h_n = h for n ≥ m. We observe that the Markov process of
order m described by equations (11) and (12) has, by definition, the same
entropy h_n as the symbolic sequence for all n not greater than m. In this way
we can build a Markov approximation which mimics the original entropy as closely
as we desire.
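
A plug-in estimate of the block entropies and of their differences can be sketched as follows. The function names are hypothetical, and the block probabilities are estimated here from all overlapping windows of the sequence, which is one simple convention among several.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def block_entropy(z: Sequence[Hashable], n: int) -> float:
    """Plug-in estimate of H_n from the frequencies of overlapping blocks of length n."""
    if n == 0:
        return 0.0
    counts = Counter(tuple(z[t:t + n]) for t in range(len(z) - n + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log(c / total) for c in counts.values())


def entropy_differences(z: Sequence[Hashable], n_max: int) -> List[float]:
    """Return [h_0, ..., h_n_max] with h_n = H_{n+1} - H_n."""
    blocks = [block_entropy(z, n) for n in range(n_max + 2)]
    return [blocks[n + 1] - blocks[n] for n in range(n_max + 1)]


def available_information(h: float) -> float:
    """Difference between the two-symbol maximum ln(2) and the entropy h."""
    return math.log(2.0) - h
```

For a finite sequence the estimates become unreliable once the number of distinct blocks is comparable with the number of windows, which is the origin of the limitation on n discussed in the text.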

From figure 5 one observes that the h_n are consistent with those of a
Markov process of order 1 (the asymptotic value h is reached approximately
in one step). It is therefore natural to conjecture that such a stochastic process
is able to mimic price variations. Using Markov processes of order greater than
one does not improve the approximation of the entropy value in an appreciable
way. The value of h_n is statistically relevant as long as the length n of the
sequence does not exceed the order of (1/h) log(L), as shown in Khinchin (1957):
this explains the folding of h_n at large n.

Fig. 5. h_n versus n for the symbolic return sequence {z_k} (x), the Markov sequence
of order 1 (+) and of order 3 (□) (these entropies are almost indistinguishable).
We also plot the h_n for a random walk (*) with its theoretical value log(2). Both
the Markov process and the random walk have the same number of elements as the
financial data with Δ = 4·10^-4.

Why is available information so important in the financial context?
Kelly (1956) has shown the link between available information and the optimal
growth rate of capital in some particular investments. A similar idea
can be applied to the patient investor, i.e. a speculator who waits to modify
his investment until a fluctuation of size Δ is present. He observes an available
information different from zero. Instead, as shown in figure 5, for a random
walk the available information is zero.

If the returns {ρ_k} at fixed Δ are ruled by a Markov chain and if one neglects
the transaction costs, Baviera et al. (1999) prove that the optimal growth rate
of the capital is equal to the available information. The case with transaction
costs is considered by Baviera and Vergni (1999).
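
The link between growth rate and available information can be checked numerically in the simplest textbook setting, which is a deliberate simplification and not the Markov-chain result of Baviera et al. (1999): independent even-odds bets won with probability p, where a fraction f of the capital is staked each time. In that setting the growth rate p ln(1 + f) + (1 - p) ln(1 - f) is maximal at f = 2p - 1, and the maximum equals ln(2) minus the entropy of the outcome. The function names below are hypothetical.

```python
import math


def growth_rate(f: float, p: float) -> float:
    """Expected logarithmic growth per bet when a fraction f is staked at even odds."""
    return p * math.log(1.0 + f) + (1.0 - p) * math.log(1.0 - f)


def kelly_fraction(p: float) -> float:
    """Stake that maximizes the growth rate for even-odds bets won with probability p."""
    return 2.0 * p - 1.0


def binary_entropy(p: float) -> float:
    """Shannon entropy (in nats) of a two-outcome variable."""
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)
```

For instance, with p = 0.6 the optimal stake is 0.2 and `growth_rate(0.2, 0.6)` coincides with `math.log(2) - binary_entropy(0.6)`; with p = 0.5 both the optimal stake and the available information vanish.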

Within a Markovian description the available information also suggests the
order of the process one has to consider. It is useless to include the information
coming from one further step in the past if one does not observe a significant
increase of the available information involved in the operation.
Furthermore, if the Markov approximation describes the available information
well, it is not so important for a speculator who wants to maximize the
growth rate of his capital that the Markovian approximation is no longer good for
other quantities. However, we shall show in the following that the Markovian
approximation not only reproduces the available information but, in addition,
is a proper description of the return dynamics itself. This is important when
one performs a more complex investment for which a detailed model
of price variation is essential (see Merton (1990)).



C Mutual information 



The mutual information is a measure of the average information one has about
an event q knowing the result of another event s. In our case the events are
the values of a process at different times.
Following Shannon (1948) we define:

    I(n) = - \int\!\!\int P(x_t, x_{t+n}) \ln \frac{P(x_t) P(x_{t+n})}{P(x_t, x_{t+n})} \, dx_t \, dx_{t+n}    (20)

where P(x_t, x_{t+n}) is the joint probability of the variables x_t and x_{t+n}, and
P(x_t) is the probability density function of x_t.

The main advantage of this technique, compared with the autocorrelation function,
is that I(n) is an intrinsic property of the process, i.e. it has the same value
if we use x_t or an invertible function of it. This is because I(n) depends on the
probability density function in such a way that the integral (20) is invariant
under the change of variable x → y = f(x).
For a Markov chain the mutual information is:

    I^{(M)}(n) = - \sum_{i,j} P_i (W^n)_{ij} \ln \frac{P_j}{(W^n)_{ij}}    (21)

and, if the Markov chain is ergodic, for large n one has

    I^{(M)}(n) \sim e^{2 (\ln|\lambda_2|) n}    (22)

where λ_2 is again the second eigenvalue of the transition matrix.
In figure 6 we compare the mutual information of the symbolic return series
{z_k} with the sequences obtained starting from Markov processes of various
orders, and also with the expected theoretical value for a Markov chain (see 
equation (22)). In the case of the mutual information the statistical error is 
of the order (see also Roulston (1999)) for a sequence of length L, since 
it is computed starting from probability distribution. 

Let us stress that the good agreement between the return sequence and the 
Markov process for both mutual information (figure 6) and correlation function 
(figure 4) is a clear indicator of the validity of the Markov approximation. 
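
The mutual information at lag n of a stationary Markov chain can be computed directly from the stationary probabilities and the n-th power of the transition matrix, since the joint probability of the states i and j at distance n is P_i (W^n)_ij. The sketch below uses plain nested lists; the function names and the example matrix in the usage note are assumptions for illustration, not values estimated from the data.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two square matrices stored as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_pow(w: Matrix, n: int) -> Matrix:
    """n-th power of a square matrix by repeated multiplication."""
    size = len(w)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, w)
    return result


def markov_mutual_information(p: Sequence[float], w: Matrix, n: int) -> float:
    """Mutual information between the states of a stationary chain at lag n."""
    wn = mat_pow(w, n)
    total = 0.0
    for i, p_i in enumerate(p):
        for j, p_j in enumerate(p):
            joint = p_i * wn[i][j]
            if joint > 0.0:  # terms with zero joint probability contribute nothing
                total -= joint * math.log(p_j / wn[i][j])
    return total
```

With the hypothetical symmetric chain `w = [[0.4, 0.6], [0.6, 0.4]]` and `p = [0.5, 0.5]`, the second eigenvalue is -0.2 and the ratio of the mutual information at successive lags approaches 0.04, its square: the mutual information decays at twice the rate of the autocorrelation function.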
Fig. 6. Mutual information of the symbolic return sequence {z_k} (+), Markov
sequence of order 1 (x) and Markov sequence of order 3 (*). The line is the
asymptotic analytical result for a Markov process of order 1 (see eq. (22)).

D Kullback entropy 

Given two discrete random variables P, Q which can assume only M different
values with probabilities p_i, q_i (i = 1, ..., M) respectively, the Kullback
entropy of the probability distribution of the variable P with respect to Q:

    J(P|Q) = \sum_{i=1}^{M} p_i \ln \frac{p_i}{q_i}    (23)

is a powerful tool to measure the "distance" between the PDFs of those
variables. In fact, it is shown by Kullback and Leibler (1956) that the function
J(P|Q) is identically zero only if the two random variables have the same
probabilities, i.e. p_i = q_i ∀ i; otherwise J(P|Q) > 0. J(P|Q) is not
a symmetric function of the two random variables. To define a symmetric
"distance" between PDFs, following Kullback and Leibler (1956) we define the
divergence between {p_i} and {q_i} as:

    K(P|Q) = J(P|Q) + J(Q|P) = \sum_{i=1}^{M} (p_i - q_i) \ln \frac{p_i}{q_i} = \sum_{i=1}^{M} (q_i - p_i) \ln \frac{q_i}{p_i}    (24)

This function is still positive definite and it is also symmetric.
This is a probabilistic "distance" between random variables, but we are
interested in testing the similarity of two stochastic processes. To perform such
a test we consider a symbolic sequence of length n, Z_n = {z^(i_1), z^(i_2), ...,
z^(i_n)}, obtained from the processes we are interested in, and we compute the
Kullback "distance" between the PDFs of such sequences for all n. When n → ∞ we
test the similarity of the processes as a whole but, as happens for the entropy
analysis, we expect for large n a limitation due to statistical errors.

Fig. 7. K_n(P|Q) versus n. The symbols indicate the order of the Markov process
involved in the similarity test: (+) indicates the first order and (x) indicates
the third order. The full line and the dashed line indicate the comparison between
the theoretical Markov process and the symbolic return series, and between the
theoretical Markov process and the symbolic Markov sequence (with the same number
of elements as the financial data), respectively.

In figure 7 we show K_n(P|Q) versus n, where P and Q are respectively the
Markov process and the financial process. The sequences Z_n for the financial
quotes are obtained from the symbolic return sequence {z_k}, and their
probabilities are numerically computed as in equation (12). For the Markov process
the probabilities of the Z_n are calculated starting from the transition matrix W.
In the case of a Markov chain we have:

    P(Z_n) = P_{j_1} W_{j_1 j_2} W_{j_2 j_3} \cdots W_{j_{n-1} j_n}

We have also calculated the Kullback entropies between the PDF of the
Markov process computed theoretically using the transition matrix and the one
computed numerically from a symbolic sequence (of the same length as the
financial sequence). This shows the relevance of the statistical error, and gives
an indication of the order of n at which one must stop to have statistically
relevant results.
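
The two ingredients of this test, the symmetric Kullback divergence between two block distributions and the probability a Markov chain assigns to a block, can be sketched as follows. The dictionary-based representation and the function names are assumptions for illustration. Note that the divergence is infinite as soon as a block has zero probability under only one of the two distributions, which is one practical reason why the comparison must stop at moderate n for a finite sample.

```python
import math
from typing import Dict, Hashable, Sequence, Tuple


def kullback_divergence(p: Dict[Hashable, float], q: Dict[Hashable, float]) -> float:
    """Symmetric divergence: sum over outcomes of (p_i - q_i) * ln(p_i / q_i)."""
    total = 0.0
    for key in set(p) | set(q):
        p_i = p.get(key, 0.0)
        q_i = q.get(key, 0.0)
        if p_i == q_i:
            continue  # identical probabilities contribute zero
        if p_i == 0.0 or q_i == 0.0:
            return math.inf  # outcome possible under only one distribution
        total += (p_i - q_i) * math.log(p_i / q_i)
    return total


def chain_block_probability(block: Sequence[Hashable],
                            p: Dict[Hashable, float],
                            w: Dict[Tuple[Hashable, Hashable], float]) -> float:
    """Probability of a block under a Markov chain: P_{j1} W_{j1 j2} ... W_{j(n-1) jn}."""
    probability = p[block[0]]
    for current, following in zip(block, block[1:]):
        probability *= w[(current, following)]
    return probability
```

To compare processes at block length n, one would build the dictionary of block probabilities for each process (empirical frequencies for the data, `chain_block_probability` for the model) and pass the two dictionaries to `kullback_divergence`.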

All the Kullback tests are performed using Markov processes of order 1 and 3,
and it is evident from figure 7 how well the latter reproduces the financial
series.

For the Kullback entropies the statistical error increases with n. In order to
have reasonable statistics we must restrict ourselves to n < (1/h) log(L),
as in the case of the entropy analysis.







At the end of this section we recall that the symbolic return series {z_k} is
chosen at fixed Δ, but the result of our analysis, i.e. the good agreement with a
Markov process, does not depend on the value of Δ.



IV Conclusions 

In econometric analysis it is obviously relevant to test the validity of
"random walk" models because of their strict link with market efficiency. In this
paper we have first discarded "random walk" models as a proper description
of some features of the financial signal and then built a model considering
the Deutschemark/US dollar quotes in the period from October 1, 1992 to
September 30, 1993.

We have developed a technique, based on the measurement of the exit time PDF,
which allows us to reject the random walk hypothesis. The presence of strong
correlations has been observed by means of an "anomalous" scaling of ⟨τ⟩
and a power-law behavior of the exit time probability density
function P_Δ(τ) for Δ > γ_T. This implies the failure of "random walk"
models, where ⟨τ⟩ scales as Δ^2 and the tail of P_Δ(τ) is exponential.
We interpret γ_T as a natural cut-off due to the absence of a profitable
trading rule for a patient investor with Δ less than γ_T: in this case profits
are less than costs. We recall that a patient investor cares only about the quotes
at which a price variation of at least a percentage Δ is present.
The main advantage of this approach is that it shows that this class of models
gives an inadequate description even at the qualitative level, and it suggests a
new point of view in financial analysis.

Instead of considering an arbitrary measure of time, we suggest limiting the
analysis to the times when something relevant from the financial point
of view happens. In particular we focus our attention on return fluctuations
of size Δ. This analysis corresponds to the behavior of a patient speculator.
We show by means of several tests that the returns of such a sequence can be
approximated by a Markov model. We have considered the usual autocorrelation
approach, obtaining a very good agreement within the statistical errors. In
spite of its simplicity, the autocorrelation analysis has the disadvantage that
it gives different results if one considers the random variable x_t or a function
f(x_t) of it. The mutual information is a generalization of this tool which does
not depend on the function f considered. The agreement observed for the
mutual information is surely a severe test of similarity.

A central role in the comparison between the approximation and the "true"
signal is surely played by the available information. Following the idea of Kelly,
it has been shown by Baviera et al. (1999) that this quantity corresponds, in the
absence of transaction costs, to the optimal growth rate of the invested capital
following a particular trading rule. We show that even a Markov process of
order 1 properly mimics the Deutschemark/US dollar behavior.
Finally we have analyzed a "distance" between processes based on the Kullback
and Leibler entropy. The advantage of this technique is that one is able
to estimate the difference between the processes with a quantity strictly
connected with the Shannon entropy and hence with the available information, i.e.
the quantity of interest for a patient speculator.

The Markov model we have considered in this paper not only has the advantage
of mimicking very well the available information of the financial series but
is also a good approximation of the "true" process itself. While the former is
the quantity of interest for a speculator who invests directly in the exchange
market, the latter is more interesting from both the theoretical and the
experimental sides, to gain a deeper insight into the market behavior. This
asset-pricing model should allow a better evaluation of risk, with natural
consequences for the field of derivatives.